
## README for Kenya Analysis Replication Packet

### Overview

This replication packet contains the data and analysis scripts used in the study **"Survey Sampling in the Global South Using Facebook Advertisements"**. It is designed to reproduce the findings reported in the paper. All necessary datasets and code are provided in this folder, except the geocoded Afrobarometer coordinates needed for Figure S4, which are available only by special request.

The main analysis script for Kenya, **`kenya_analysis_replicate.R`**, is the starting point for running the entire analysis. It loads the necessary datasets, performs data cleaning, conducts the statistical analysis, and outputs the results.

A second file, the Jupyter notebook **`fig_s4_afrobarometer_site_clustering.ipynb`**, replicates Figure S4.

### How to Run the Analysis

1. Open the file **`kenya_analysis_replicate.R`**.
2. Ensure that all necessary packages are installed. The required R packages include:
   - `foreign`, `car`, `ggplot2`, `plyr`, `Rmisc`, `grid`, `stargazer`, `lmtest`, `sandwich`, `multiwayvcov`, `purrr`, `dotwhisker`, `broom`, `gtools`, `stringr`, `DataCombine`, `tidyr`, `tidyverse`, `haven`, `survey`, `data.table`, `xtable`.
3. Set the working directory in R to the location where the replication packet is stored.
4. Run the script **`kenya_analysis_replicate.R`** to reproduce the analysis.

### How to Run the Figure S4 Replication

1. Open the file **`fig_s4_afrobarometer_site_clustering.ipynb`** using Jupyter Notebook.
2. Ensure that all necessary packages are installed. The analysis was run with Python 3.6.10. The required Python packages are:
   - `pandas==1.1.0`
   - `numpy==1.19.2`
   - `geopandas==0.8.1`
   - `matplotlib==3.2.2`
   - `fiona==1.8.13.post1`
   - `shapely==1.7.1`
   - `scikit-learn==0.23.2` (imported as `sklearn`)
3. Run the notebook **`fig_s4_afrobarometer_site_clustering.ipynb`** to reproduce Figure S4. Because file paths are relative, the working directory does not need to be set.
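
Before running the notebook, it may help to confirm that the installed package versions match the pinned ones. The following is a minimal sketch, not part of the replication packet. The helper name `check_versions` is hypothetical, and the distribution name `scikit-learn` is an assumption (it is the PyPI name of the package imported as `sklearn`). `importlib.metadata` requires Python 3.8 or newer, so the sketch falls back to the `importlib_metadata` backport on older interpreters such as 3.6.10.

```python
try:
    from importlib import metadata  # standard library in Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport for older Pythons

# Pinned versions copied from the package list in this README.
# "scikit-learn" is an assumed distribution name for the `sklearn` import.
PINNED = {
    "pandas": "1.1.0",
    "numpy": "1.19.2",
    "geopandas": "0.8.1",
    "matplotlib": "3.2.2",
    "fiona": "1.8.13.post1",
    "shapely": "1.7.1",
    "scikit-learn": "0.23.2",
}


def check_versions(pinned, get_version=metadata.version):
    """Return {package: (expected, found)} for every mismatch.

    `found` is None when the package is not installed.
    """
    problems = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

A version mismatch does not necessarily prevent the notebook from running; the check only reports where the environment differs from the one described above.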

**IMPORTANT**: Figure S4 relies on geocoded Afrobarometer cluster data, which are available only by special request. We have not included these data in this replication folder. Specifically, in the files `kenya replication/data sets/afrobarometer_cluster_sizes.csv` and `kenya replication/data sets/afrobarometer_kmeans_targets.csv`, the `latitude` and `longitude` columns are empty and must be filled in by the reader. Readers may request geocoded Afrobarometer data here: https://www.afrobarometer.org/contact-us/data-requests/.
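
After filling in the coordinates, a quick check such as the sketch below can confirm that no rows were left blank. It is an illustration only and is not part of the replication packet. The function name `empty_coordinate_rows` is hypothetical, and the code assumes the two files are comma-separated, UTF-8 encoded, and have header columns named exactly `latitude` and `longitude`.

```python
import csv
import os

# Relative paths as given in this README; adjust if your layout differs.
COORDINATE_FILES = [
    "kenya replication/data sets/afrobarometer_cluster_sizes.csv",
    "kenya replication/data sets/afrobarometer_kmeans_targets.csv",
]


def empty_coordinate_rows(csv_path, columns=("latitude", "longitude")):
    """Return 1-based data-row numbers where any coordinate column is blank.

    If a column is absent from the header, every row is reported.
    """
    missing = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        for row_number, row in enumerate(reader, start=1):
            if any(not (row.get(col) or "").strip() for col in columns):
                missing.append(row_number)
    return missing


if __name__ == "__main__":
    for path in COORDINATE_FILES:
        if not os.path.isfile(path):
            print(f"{path}: file not found")
            continue
        rows = empty_coordinate_rows(path)
        print(f"{path}: {len(rows)} row(s) with missing coordinates")
```

The check only detects blank cells; it does not validate that the coordinates are correct.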

### Datasets

1. **`kenya_facebook.csv`**:
   - This dataset contains the survey data from Facebook users in Kenya. It is the primary dataset used for the social media analysis portion of the study.
   
2. **`kenya_facebook_plusnonresponse.csv`**:
   - This dataset is an extended version of `kenya_facebook.csv` that also includes responses from individuals who did not complete the survey. It is used for the nonresponse analysis in the study.

3. **`census_tables_cleaned_VOLUME IV KHPC 2019.xlsx`**:
   - This Excel file contains cleaned census data from the 2019 Kenya Population and Housing Census (KPHC; the filename spells it "KHPC"). It is used to generate population benchmarks for weighting the survey data.

4. **`ken_r8.data_.new_.final_.wtd_release.31mar21.sav`**:
   - This dataset contains data from the Afrobarometer Round 8 survey conducted in Kenya in 2019. It is used to compare survey results from Facebook data with nationally representative data from the Afrobarometer survey.

5. **`Kenya_ad_sets_dataframe.csv`**:
   - This dataset contains ad set data downloaded from Facebook Ads Manager, the tool used to run the Facebook ads. It is used to analyze the cost of Facebook ads under different targeting strategies.

6. **`afrobarometer_kmeans_targets.csv`**:
   - This dataset contains the k-means clusters used for the geolocated targeting of Facebook ads in Kenya.

7. **`afrobarometer_cluster_sizes.csv`**:
   - This dataset contains Facebook audience estimates for the coordinates of the survey clusters in the Afrobarometer survey.


### Note

- Ensure that all paths to the datasets in the script are correct. If the script cannot find a dataset, edit the corresponding file path to match your directory structure.
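
One way to check the paths before running anything is to list which of the datasets named in this README are missing from a given folder. The sketch below is illustrative only. The function name `missing_datasets` is hypothetical, and the default folder `kenya replication/data sets` is an assumption: this README confirms that location only for the two Afrobarometer CSV files.

```python
from pathlib import Path

# File names as listed in the Datasets section of this README.
DATASETS = [
    "kenya_facebook.csv",
    "kenya_facebook_plusnonresponse.csv",
    "census_tables_cleaned_VOLUME IV KHPC 2019.xlsx",
    "ken_r8.data_.new_.final_.wtd_release.31mar21.sav",
    "Kenya_ad_sets_dataframe.csv",
    "afrobarometer_kmeans_targets.csv",
    "afrobarometer_cluster_sizes.csv",
]


def missing_datasets(data_dir, filenames=DATASETS):
    """Return the file names from `filenames` that are not present in `data_dir`."""
    base = Path(data_dir)
    return [name for name in filenames if not (base / name).is_file()]


if __name__ == "__main__":
    # Assumed data folder; change this to match your directory structure.
    for name in missing_datasets("kenya replication/data sets"):
        print(f"missing: {name}")
```

If a file is reported missing but exists elsewhere in the packet, the path in the analysis script likely needs the same adjustment.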
